Smart Paper Notes

MEDS-Net: Self-Distilled Multi-Encoders Network with Bi-Direction Maximum Intensity projections for Lung Nodule Detection

Muhammad Usman , Azka Rehman , Abdullah Shahid , Siddique Latif , Shi Sub Byon , Byoung Dai Lee , Sung Hyun Kim , Byung il Lee , Yeong Gil Shin

Category: Computer Vision

2022-10-30

In this study, we propose a lung nodule detection scheme that fully incorporates the clinical workflow of radiologists. In particular, we exploit bi-directional maximum intensity projection (MIP) images of various thicknesses (i.e., 3, 5, and 10 mm) along with a 3D patch of the CT scan consisting of 10 adjacent slices, and feed them into a self-distillation-based Multi-Encoders Network (MEDS-Net). The proposed architecture first condenses the 3D patch input to three channels using a dense block, whose dense units effectively examine nodule presence in the 2D axial slices. This condensed information, along with the forward and backward MIP images, is fed to three different encoders to learn the most meaningful representation, which is forwarded to the decoder block at various levels. At the decoder block, we employ a self-distillation mechanism by connecting a distillation block that contains five lung nodule detectors. This expedites convergence and improves the learning ability of the proposed architecture. Finally, the proposed scheme reduces false positives by complementing the main detector with auxiliary detectors. The proposed scheme has been rigorously evaluated on the 888 scans of the LUNA16 dataset and obtained a CPM score of 93.6%. The results demonstrate that incorporating bi-directional MIP images enables MEDS-Net to effectively distinguish nodules from their surroundings, which helps to achieve sensitivities of 91.5% and 92.8% at false positive rates of 0.25 and 0.5 per scan, respectively.
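The bi-directional MIP inputs described above can be illustrated with a minimal sketch. It assumes that the forward projection takes the pixel-wise maximum over the slices ahead of a central slice and the backward projection over the slices behind it, and that thickness is counted in slices rather than millimetres. The function names `mip` and `bidirectional_mip` are hypothetical and are not taken from the MEDS-Net code.

```python
def mip(slices):
    """Pixel-wise maximum over a list of equally sized 2D slices."""
    rows, cols = len(slices[0]), len(slices[0][0])
    return [[max(s[r][c] for s in slices) for c in range(cols)]
            for r in range(rows)]


def bidirectional_mip(volume, center, thickness):
    """Return (forward, backward) MIPs around the slice at index `center`.

    Assumption: `thickness` is counted in slices. Forward covers
    center..center+thickness-1 and backward covers
    center-thickness+1..center, both clipped to the volume bounds.
    """
    forward = mip(volume[center:center + thickness])
    backward = mip(volume[max(0, center - thickness + 1):center + 1])
    return forward, backward


# Toy volume: four axial slices, each 1 row by 2 columns.
volume = [[[1, 0]], [[0, 2]], [[3, 0]], [[0, 4]]]
forward, backward = bidirectional_mip(volume, center=1, thickness=2)
```

In practice the slice count for a 3, 5, or 10 mm projection would depend on the slice spacing of the scan, which this sketch leaves out.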


Multi-View Attention Transfer for Efficient Speech Enhancement

Wooseok Shin , Hyun Joon Park , Jin Sob Kim , Byung Hoon Lee , Sung Won Han

Category: Machine Learning

2022-08-22

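Recent deep learning models have achieved high performance in speech enhancement. However, obtaining a fast, low-complexity model without significant performance degradation remains challenging. Previous knowledge distillation studies on speech enhancement could not solve this problem because their output distillation methods do not fit the speech enhancement task in some respects. In this study, we propose multi-view attention transfer (MV-AT), a feature-based distillation method, to obtain efficient speech enhancement models in the time domain. Based on a multi-view feature extraction model, MV-AT transfers the multi-view knowledge of the teacher network to the student network without additional parameters. Experimental results show that the proposed method consistently improves the performance of student models of various sizes on the Valentini and deep noise suppression (DNS) datasets. MANNER-S-8.1GF with our proposed method, a lightweight model for efficient deployment, has 15.4x fewer parameters and 4.71x fewer floating-point operations (FLOPs) than the baseline model with similar performance.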
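As a rough illustration of feature-based distillation without extra parameters, the sketch below implements generic attention transfer, not the multi-view variant of MV-AT, whose details the abstract does not give. It assumes each feature map is a list of channels over time steps, and that collapsing the channel axis lets teacher and student have different channel counts. All names are hypothetical.

```python
import math


def attention_map(features, p=2):
    """Collapse channels into a[t] = sum_c |f[c][t]|^p, then L2-normalize."""
    steps = len(features[0])
    raw = [sum(abs(channel[t]) ** p for channel in features)
           for t in range(steps)]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # guard all-zero maps
    return [v / norm for v in raw]


def attention_transfer_loss(teacher, student, p=2):
    """L2 distance between normalized teacher and student attention maps."""
    t_map = attention_map(teacher, p)
    s_map = attention_map(student, p)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_map, s_map)))


# Teacher has two channels, student has one; no projection layer is needed.
teacher = [[1.0, 0.0], [1.0, 0.0]]
student_good = [[3.0, 0.0]]
student_bad = [[0.0, 3.0]]
loss_good = attention_transfer_loss(teacher, student_good)
loss_bad = attention_transfer_loss(teacher, student_bad)
```

Because only the normalized temporal pattern is compared, the loss adds no learnable parameters to either network.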


Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream

Susik Yoon , Youngjun Lee , Jae-Gil Lee , Byung Suk Lee

Category: Machine Learning | Artificial Intelligence

2022-06-09

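Online anomaly detection from data streams is critical for the safety and security of many applications, but it faces severe challenges due to complex and evolving data streams from IoT devices and cloud-based infrastructures. Unfortunately, existing approaches fall short of these challenges: online anomaly detection methods bear the burden of handling the complexity, while offline deep anomaly detection methods suffer from the evolving data distribution. This paper presents ARCUS, a framework for online deep anomaly detection that can be instantiated with any autoencoder-based deep anomaly detection method. It handles complex and evolving data streams using an adaptive model pooling approach with two novel techniques: concept-driven inference and drift-aware model pool update. The former detects anomalies with the combination of models best suited to the complexity, and the latter dynamically adjusts the model pool to fit the evolving data stream. In comprehensive experiments on ten datasets that are both high-dimensional and concept-drifted, ARCUS improved the anomaly detection accuracy of the streaming variants of state-of-the-art autoencoder-based methods and of state-of-the-art streaming anomaly detection methods by 22% and 37%, respectively.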
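The idea of a model pool that grows when the stream drifts can be shown with a deliberately simplified toy. It is not the ARCUS algorithm: each "model" here is just a mean vector standing in for an autoencoder, squared distance stands in for reconstruction error, and the drift test is a fixed threshold. The class `ToyModelPool` and its parameters are hypothetical.

```python
def mean_vector(batch):
    """Component-wise mean of a batch of equal-length vectors."""
    n, d = len(batch), len(batch[0])
    return [sum(x[i] for x in batch) / n for i in range(d)]


def sq_error(center, x):
    """Squared distance, standing in for an autoencoder's reconstruction error."""
    return sum((a - b) ** 2 for a, b in zip(center, x))


class ToyModelPool:
    def __init__(self, drift_threshold):
        self.centers = []  # one stand-in "model" per observed concept
        self.drift_threshold = drift_threshold

    def score(self, x):
        """Anomaly score: error under the best-fitting model in the pool."""
        return min(sq_error(c, x) for c in self.centers)

    def update(self, batch):
        """Add a model when no pooled model fits the batch mean; report drift."""
        center = mean_vector(batch)
        drifted = (not self.centers or
                   min(sq_error(c, center) for c in self.centers)
                   > self.drift_threshold)
        if drifted:
            self.centers.append(center)
        return drifted


pool = ToyModelPool(drift_threshold=1.0)
batches = ([[0.0, 0.0]] * 2, [[0.1, 0.0]] * 2, [[5.0, 5.0]] * 2)
flags = [pool.update(batch) for batch in batches]
```

Only the third batch, which lies far from the first concept, adds a second model after the initial one. The abstract describes combining several models at inference time, which this toy replaces with a simple minimum.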


CFA: Coupled-hypersphere-based Feature Adaptation for Target-Oriented Anomaly Localization

Sungwook Lee , Seunghyun Lee , Byung Cheol Song

Category: Computer Vision | Machine Learning

2022-06-09

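Anomaly localization has long been widely used in industry. Previous studies focused on approximating the distribution of normal features without adapting to the target dataset. However, since anomaly localization must precisely discriminate between normal and abnormal features, this lack of adaptation may cause the normality of abnormal features to be overestimated. We therefore propose Coupled-hypersphere-based Feature Adaptation (CFA), which accomplishes sophisticated anomaly localization using features adapted to the target dataset. CFA consists of (1) a learnable patch descriptor that learns and embeds target-oriented features and (2) a scalable memory bank whose size is independent of the size of the target dataset. CFA also adopts transfer learning to increase the density of normal features, so that abnormal features can be clearly distinguished, by applying the patch descriptor and memory bank to a pre-trained CNN. The proposed method outperforms previous methods both quantitatively and qualitatively. For example, it achieves an AUROC score of 99.5% in anomaly detection and 98.5% in anomaly localization on the MVTec AD benchmark. In addition, this paper points out the negative effects of the biased features of pre-trained CNNs and emphasizes the importance of adapting to the target dataset. The code is publicly available at https://github.com/sungwool/cfa_for_anomaly_localization.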
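How a memory bank of normal features can separate abnormal patches is sketched below. It assumes that each memorized feature is the centre of a hypersphere of radius `radius` and that a patch is scored by how far it falls outside the nearest one. This is an illustrative reading of the abstract, not the scoring rule from the CFA repository, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def patch_score(feature, memory_bank, radius):
    """Score a patch by its excess distance beyond the nearest hypersphere.

    Assumption: features inside any hypersphere of the given radius around
    a memorized normal feature count as normal and receive a score of 0.
    """
    nearest = min(sq_dist(feature, center) for center in memory_bank)
    return max(0.0, nearest - radius ** 2)


# Two memorized normal features in a 2D embedding space.
memory_bank = [[0.0, 0.0], [10.0, 0.0]]
normal_score = patch_score([0.5, 0.0], memory_bank, radius=1.0)
abnormal_score = patch_score([3.0, 0.0], memory_bank, radius=1.0)
```

Computing this score for every patch position would give a coarse anomaly map. The learnable patch descriptor mentioned in the abstract would determine the embedding that the scores are computed in.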


Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

Seunghyun Lee , Byung Cheol Song

Category: Machine Learning | Artificial Intelligence | Computer Vision

2022-03-05

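Conventional NAS-based pruning algorithms aim to find the sub-network with the best validation performance. However, validation performance does not reliably represent test performance, i.e., potential performance. In addition, although fine-tuning the pruned network to recover the performance drop is an unavoidable step, few studies have addressed this issue. This paper presents a novel Ensemble Knowledge Guidance (EKG) to solve both problems at once. First, we show experimentally that the fluctuation of the loss landscape can be an effective metric for evaluating potential performance. To search for the sub-network with the smoothest loss landscape at low cost, we employ EKG as the search reward. The EKG used for the following search iteration is composed of the ensemble knowledge of interim sub-networks, i.e., the by-products of sub-network evaluation. Next, we reuse EKG to provide gentle and informative guidance to the pruned network during fine-tuning. Since EKG is implemented as a memory bank in both phases, it incurs negligible cost. For example, when pruning and training ResNet-50, only 315 GPU hours are needed to remove about 45.04% of the FLOPs without any performance degradation, which is feasible even on a low-spec workstation. The code is available at https://github.com/sseung0703/ekg.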
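The notion of ensemble knowledge used as guidance can be sketched as averaged soft predictions that a pruned network is pulled toward. The sketch assumes a plain average of softmax outputs and a KL-divergence guidance loss. The abstract does not specify how EKG weights or stores the interim sub-networks, and all names below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_targets(logit_sets):
    """Average the softmax outputs of several interim sub-networks."""
    probs = [softmax(logits) for logits in logit_sets]
    return [sum(p[k] for p in probs) / len(probs)
            for k in range(len(probs[0]))]


def guidance_loss(student_logits, targets):
    """KL divergence from the ensemble targets to the student prediction."""
    q = softmax(student_logits)
    return sum(t * math.log(t / s) for t, s in zip(targets, q) if t > 0.0)


# Two interim sub-networks that are both undecided between two classes.
targets = ensemble_targets([[0.0, 0.0], [0.0, 0.0]])
loss_match = guidance_loss([1.0, 1.0], targets)
loss_mismatch = guidance_loss([2.0, 0.0], targets)
```

A student that agrees with the ensemble gets zero loss, while an overconfident one is penalized. Storing `targets` per training sample would correspond to the memory-bank idea in the abstract.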


Vision Transformer for Small-Size Datasets

Seung Hoon Lee , Seunghyun Lee , Byung Cheol Song

Category: Computer Vision

2021-12-27

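Recently, the Vision Transformer (ViT), which applies the transformer architecture to image classification, has outperformed convolutional neural networks. However, the high performance of ViT results from pre-training on large datasets such as JFT-300M, and its dependence on large datasets is attributed to its low locality inductive bias. This paper proposes Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA), which effectively address the lack of locality inductive bias and enable ViTs to learn from scratch even on small datasets. Moreover, SPT and LSA are generic and effective add-on modules that can easily be applied to various ViTs. Experimental results show that when both SPT and LSA are applied to ViTs, performance improves by an average of 2.96% on Tiny-ImageNet, a representative small dataset. In particular, Swin Transformer achieves a substantial performance improvement of 4.08% thanks to the proposed SPT and LSA.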
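One way to make self-attention more local is to stop each token from attending to itself and to sharpen the softmax with a temperature. The sketch below assumes LSA works along these lines (diagonal masking plus temperature scaling) and uses a fixed temperature where a trained model would learn it. The function `lsa_weights` is a hypothetical name, and the sketch needs at least two tokens.

```python
import math


def lsa_weights(scores, temperature):
    """Row-wise softmax of scores / temperature with the diagonal masked out.

    `scores` is a square matrix of query-key similarities. Each token's
    similarity to itself is excluded, so its attention weight is 0.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [scores[i][j] / temperature for j in range(n) if j != i]
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        probs.insert(i, 0.0)  # put the masked self-attention weight back
        weights.append(probs)
    return weights


# Self-similarity dominates the raw scores but is masked away.
masked = lsa_weights([[5.0, 0.0, 0.0],
                      [0.0, 5.0, 0.0],
                      [0.0, 0.0, 5.0]], temperature=1.0)

# A lower temperature sharpens the distribution over the other tokens.
scores = [[0.0, 2.0, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
soft = lsa_weights(scores, temperature=1.0)
sharp = lsa_weights(scores, temperature=0.5)
```

With the diagonal removed, the large self-similarity values no longer flatten the attention paid to neighbouring tokens.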


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Extraction of Coronary Vessels in Fluoroscopic X-Ray Sequences Using Vessel Correspondence Optimization

Seung Yeon Shin , Soochahn Lee , Kyoung Jin Noh , Il Dong Yun , Kyoung Mu Lee

Category: Computer Vision

2022-07-28

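We present a method for extracting coronary vessels from fluoroscopic X-ray sequences. Given the vessel structure of the source frame, vessel correspondence candidates in the subsequent frame are generated by a novel hierarchical search scheme to overcome the aperture problem. Optimal correspondences are determined within a Markov random field optimization framework. Post-processing is then performed to extract vessel branches that become newly visible due to the inflow of contrast agent. Quantitative and qualitative evaluation on a dataset of 18 sequences demonstrates the effectiveness of the proposed method.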


Enhanced Correlation Matching based Video Frame Interpolation

Sungho Lee , Narae Choi , Woong Il Choi

Category: Computer Vision

2021-11-17

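We propose a novel DNN-based framework, the Enhanced Correlation Matching based Video Frame Interpolation Network, to support high resolutions such as 4K, which involve large motion and occlusion. To make the network model scale with resolution, the proposed scheme employs a recurrent pyramid architecture that shares parameters across pyramid layers for optical flow estimation. In the proposed flow estimation, the optical flows are recursively refined by tracing the location with maximum correlation. Forward-warping-based correlation matching improves the accuracy of the flow update by excluding incorrectly warped features around occluded regions. Based on the final bi-directional flows, the intermediate frame at an arbitrary temporal position is synthesized using a warping and blending network and is further improved by a refinement network. Experimental results show that the proposed scheme outperforms previous work on 4K video data and low-resolution benchmark datasets in terms of objective and subjective quality, with the smallest number of model parameters.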
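The step of tracing the location with maximum correlation can be illustrated in isolation. The sketch assumes dot-product correlation between per-pixel feature vectors and an exhaustive search in a small square window. It omits the forward warping, the pyramid, and the recurrent refinement described above, and `best_displacement` is a hypothetical name.

```python
def dot(a, b):
    """Dot-product correlation between two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def best_displacement(feat0, feat1, y, x, radius):
    """Return the (dy, dx) within `radius` that maximizes correlation.

    feat0 and feat1 are H x W grids of feature vectors. Candidates that
    fall outside feat1 are skipped; ties keep the first candidate found.
    """
    height, width = len(feat1), len(feat1[0])
    best, best_corr = (0, 0), float("-inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                corr = dot(feat0[y][x], feat1[ny][nx])
                if corr > best_corr:
                    best, best_corr = (dy, dx), corr
    return best


def zero_map(size):
    """Square grid of 2D zero feature vectors."""
    return [[[0.0, 0.0] for _ in range(size)] for _ in range(size)]


# A single distinctive feature moves one pixel to the right between frames.
feat0, feat1 = zero_map(3), zero_map(3)
feat0[1][1] = [1.0, 0.0]
feat1[1][2] = [1.0, 0.0]
flow_update = best_displacement(feat0, feat1, y=1, x=1, radius=1)
```

Repeating such an update from coarse to fine pyramid levels is one common way to cover large motions with a small search radius.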


Neural Latents Benchmark '21: Evaluating latent variable models of neural population activity

Felix Pei , Joel Ye , David Zoltowski , Anqi Wu , Raeed H. Chowdhury , Hansem Sohn , Joseph E. O'Doherty , Krishna V. Shenoy , Matthew T. Kaufman , Mark Churchland

Category: Machine Learning

2021-09-09

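Advances in neural recording now offer opportunities to study neural activity in unprecedented detail. Latent variable models (LVMs) are promising tools for analyzing this rich activity across diverse neural systems and behaviors, because LVMs do not depend on known relationships between the activity and external experimental variables. However, progress on LVMs for neuronal population activity is currently impeded by a lack of standardization, which leads to methods being developed and compared in an ad hoc manner. To coordinate these modeling efforts, we introduce a benchmark suite for latent variable modeling of neural population activity. We curated four datasets of neural spiking activity from cognitive, sensory, and motor areas to promote models that apply to the wide variety of activity seen across these areas. We identify unsupervised evaluation as a common framework for evaluating models across datasets, and apply several baselines that demonstrate the diversity of the benchmark. We release this benchmark through EvalAI. http://neurallatents.github.io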
